AITopics | side length

side length

Thinker: Learning to Think Fast and Slow

Neural Information Processing Systems | Jun-19-2026, 22:39:14 GMT

Recent studies show that the reasoning capabilities of Large Language Models (LLMs) can be improved by applying Reinforcement Learning (RL) to question-answering (QA) tasks in areas such as math and coding. With a long context length, LLMs may learn to perform search, as indicated by the self-correction behavior observed in DeepSeek R1. However, this search behavior is often imprecise and lacks confidence, resulting in long, redundant responses and highlighting deficiencies in intuition and verification. Inspired by the Dual Process Theory in psychology, we introduce a simple modification to the QA task that includes four stages: Fast Thinking, where the LLM must answer within a strict token budget; Verification, where the model evaluates its initial response; Slow Thinking, where it refines the initial response with more deliberation; and Summarization, where it distills the refinement from the previous stage into precise steps. Our proposed task improves average accuracy from 25.6% to 27.3% for Qwen2.5-1.5B, and from 45.9% to 51.0% for DeepSeek-R1-Qwen-1.5B. Notably, for Qwen2.5-1.5B, the Fast Thinking mode alone achieves 25.2% accuracy using fewer than 1000 tokens, demonstrating substantial inference efficiency gains. These findings suggest that intuition and deliberative reasoning are distinct, complementary systems benefiting from targeted training. Additionally, we have open-sourced both the trained models and the source code.

large language model, machine learning, reinforcement learning, (22 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Thinker: Learning to Think Fast and Slow

Chung, Stephen, Du, Wenyu, Fu, Jie

arXiv.org Artificial Intelligence | Oct-17-2025

Recent studies show that the reasoning capabilities of Large Language Models (LLMs) can be improved by applying Reinforcement Learning (RL) to question-answering (QA) tasks in areas such as math and coding. With a long context length, LLMs may learn to perform search, as indicated by the self-correction behavior observed in DeepSeek R1. However, this search behavior is often imprecise and lacks confidence, resulting in long, redundant responses and highlighting deficiencies in intuition and verification. Inspired by the Dual Process Theory in psychology, we introduce a simple modification to the QA task that includes four stages: Fast Thinking, where the LLM must answer within a strict token budget; Verification, where the model evaluates its initial response; Slow Thinking, where it refines the initial response with more deliberation; and Summarization, where it distills the refinement from the previous stage into precise steps. Our proposed task improves average accuracy from 25.6% to 27.3% for Qwen2.5-1.5B, and from 45.9% to 51.0% for DeepSeek-R1-Qwen-1.5B. Notably, for Qwen2.5-1.5B, the Fast Thinking mode alone achieves 25.2% accuracy using fewer than 1000 tokens, demonstrating substantial inference efficiency gains. These findings suggest that intuition and deliberative reasoning are distinct, complementary systems benefiting from targeted training. Additionally, we have open-sourced both the trained models and the source code.
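The four stages named in the abstract can be sketched as a simple control flow. This is only an illustration under stated assumptions: `generate` is a hypothetical callable standing in for any LLM call, the prompt wording is invented, and the token budgets are placeholder values rather than the authors' settings. It does not reproduce the paper's training procedure or reward design.

```python
from typing import Callable, Dict


def four_stage_episode(
    question: str,
    generate: Callable[[str, int], str],  # hypothetical: (prompt, max_tokens) -> text
    fast_budget: int = 1000,   # placeholder budget, not taken from the paper
    slow_budget: int = 8000,   # placeholder budget, not taken from the paper
) -> Dict[str, str]:
    """Run Fast Thinking, Verification, Slow Thinking, and Summarization in order."""
    # Stage 1: Fast Thinking, answer under a strict token budget.
    fast = generate(f"Question: {question}\nAnswer concisely.", fast_budget)

    # Stage 2: Verification, the model evaluates its own initial response.
    verification = generate(
        f"Question: {question}\nProposed answer: {fast}\n"
        "Check whether this answer is correct.",
        slow_budget,
    )

    # Stage 3: Slow Thinking, refine the initial response with more deliberation.
    slow = generate(
        f"Question: {question}\nInitial answer: {fast}\n"
        f"Verification: {verification}\nRefine the answer step by step.",
        slow_budget,
    )

    # Stage 4: Summarization, distill the refinement into precise steps.
    summary = generate(
        f"Question: {question}\nDetailed reasoning: {slow}\n"
        "Summarize the reasoning as precise steps.",
        fast_budget,
    )
    return {
        "fast": fast,
        "verification": verification,
        "slow": slow,
        "summary": summary,
    }
```

Passing the model call in as a parameter keeps the sketch independent of any particular inference library.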

large language model, machine learning, reinforcement learning, (20 more...)

arXiv.org Artificial Intelligence

2505.21097

Country: Asia (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

A formula for the area of a triangle: Useless, but explicitly in Deep Sets form

Hainje, Connor, Hogg, David W.

arXiv.org Machine Learning | Mar-28-2025

Any permutation-invariant function of data points $\vec{r}_i$ can be written in the form $\rho(\sum_i\phi(\vec{r}_i))$ for suitable functions $\rho$ and $\phi$. This form - known in the machine-learning literature as Deep Sets - also generates a map-reduce algorithm. The area of a triangle is a permutation-invariant function of the locations $\vec{r}_i$ of the three corners $1\leq i\leq 3$. We find the polynomial formula for the area of a triangle that is explicitly in Deep Sets form. This project was motivated by questions about the fundamental computational complexity of $n$-point statistics in cosmology; that said, no insights of any kind were gained from these results.
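To make the form $\rho(\sum_i\phi(\vec{r}_i))$ concrete, here is a minimal sketch for a triangle in the plane. It is an illustrative construction, not the polynomial formula from the paper: it assumes 2D corners and uses a square root inside $\rho$. The idea is that the centered second-moment matrix $M = \sum_i \vec{r}_i\vec{r}_i^T - \frac{1}{3}(\sum_i \vec{r}_i)(\sum_i \vec{r}_i)^T$ is built entirely from per-point sums, and for three points in the plane $\det M = 4A^2/3$, where $A$ is the area. The function names are hypothetical.

```python
import math
from itertools import permutations


def phi(r):
    """Per-corner features: first and second moments of one point (x, y)."""
    x, y = r
    return (x, y, x * x, x * y, y * y)


def rho(s):
    """Map the summed features to the triangle area."""
    sx, sy, sxx, sxy, syy = s
    # Centered second-moment matrix, built only from the sums.
    mxx = sxx - sx * sx / 3.0
    mxy = sxy - sx * sy / 3.0
    myy = syy - sy * sy / 3.0
    det = mxx * myy - mxy * mxy
    # For three planar points, det(M) = 4 A^2 / 3; clamp tiny negative rounding.
    return math.sqrt(max(0.0, 3.0 * det)) / 2.0


def area_deep_sets(corners):
    """Area as rho(sum_i phi(r_i)): a map step (phi) then a reduce step (sum)."""
    summed = [sum(column) for column in zip(*(phi(r) for r in corners))]
    return rho(summed)


def area_shoelace(corners):
    """Reference area from the usual cross-product formula."""
    (x1, y1), (x2, y2), (x3, y3) = corners
    return abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) / 2.0


def max_permutation_gap(corners):
    """Largest disagreement with the reference over all corner orderings."""
    return max(
        abs(area_deep_sets(list(p)) - area_shoelace(corners))
        for p in permutations(corners)
    )
```

Because the corners enter only through a sum, reordering them changes the result by floating-point rounding at most, which `max_permutation_gap` checks against the cross-product formula.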

deep set form, machine learning, natural language, (16 more...)

arXiv.org Machine Learning

2503.22786

Country: North America > United States > New York (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning (0.88)

Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF

Gao, Zhaolin, Zhan, Wenhao, Chang, Jonathan D., Swamy, Gokul, Brantley, Kianté, Lee, Jason D., Sun, Wen

arXiv.org Artificial Intelligence | Oct-6-2024

Large Language Models (LLMs) have achieved remarkable success at tasks like summarization that involve a single turn of interaction. However, they can still struggle with multi-turn tasks like dialogue that require long-term planning. Previous works on multi-turn dialogue extend single-turn reinforcement learning from human feedback (RLHF) methods to the multi-turn setting by treating all prior dialogue turns as a long context. Such approaches suffer from covariate shift: the conversations in the training set have previous turns generated by some reference policy, which means that low training error may not necessarily correspond to good performance when the learner is actually in the conversation loop. In response, we introduce REgressing the RELative FUture (REFUEL), an efficient policy optimization approach designed to address multi-turn RLHF in LLMs. REFUEL employs a single model to estimate $Q$-values and trains on self-generated data, addressing the covariate shift issue. REFUEL frames the multi-turn RLHF problem as a sequence of regression tasks on iteratively collected datasets, enabling ease of implementation. Theoretically, we prove that REFUEL can match the performance of any policy covered by the training set. Empirically, we evaluate our algorithm by using Llama-3.1-70B-it to simulate a user in conversation with our model. REFUEL consistently outperforms state-of-the-art methods such as DPO and REBEL across various settings. Furthermore, despite having only 8 billion parameters, Llama-3-8B-it fine-tuned with REFUEL outperforms Llama-3.1-70B-it on long multi-turn dialogues. Implementation of REFUEL can be found at https://github.com/ZhaolinGao/REFUEL/, and models trained by REFUEL can be found at https://huggingface.co/Cornell-AGI.

side length, square tile, tile size, (16 more...)

arXiv.org Artificial Intelligence

2410.04612

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York (0.05)
North America > United States > Rhode Island (0.04)
(6 more...)

Genre:

Workflow (0.93)
Research Report (0.84)

Industry: Banking & Finance > Real Estate (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Exploring the Compositional Deficiency of Large Language Models in Mathematical Reasoning

Zhao, Jun, Tong, Jingqi, Mou, Yurong, Zhang, Ming, Zhang, Qi, Huang, Xuanjing

arXiv.org Artificial Intelligence | Jul-11-2024

Human cognition exhibits systematic compositionality, the algebraic ability to generate infinite novel combinations from finite learned components, which is the key to understanding and reasoning about complex logic. In this work, we investigate the compositionality of large language models (LLMs) in mathematical reasoning. Specifically, we construct a new dataset MathTrap by introducing carefully designed logical traps into the problem descriptions of MATH and GSM8k. Since problems with logical flaws are quite rare in the real world, these represent "unseen" cases to LLMs. Solving these requires the models to systematically compose (1) the mathematical knowledge involved in the original problems with (2) knowledge related to the introduced traps. Our experiments show that while LLMs possess both components of requisite knowledge, they do not spontaneously combine them to handle these novel cases. We explore several methods to mitigate this deficiency, such as natural language prompts, few-shot demonstrations, and fine-tuning. We find that LLMs' performance can be passively improved through the above external intervention. Overall, systematic compositionality remains an open challenge for large language models.

equilateral triangle, trap problem, triangle, (16 more...)

arXiv.org Artificial Intelligence

2405.0668

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

$\texttt{LM}^\texttt{2}$: A Simple Society of Language Models Solves Complex Reasoning

Juneja, Gurusha, Dutta, Subhabrata, Chakraborty, Tanmoy

arXiv.org Artificial Intelligence | Apr-2-2024

Despite demonstrating emergent reasoning abilities, Large Language Models (LLMs) often lose track of complex, multi-step reasoning. Existing studies show that providing guidance via decomposing the original question into multiple subproblems elicits more robustness in LLM reasoning -- a decomposer generates the subproblems, and a solver solves each of these subproblems. However, these techniques fail to accommodate coordination between the decomposer and the solver modules (either in a single model or different specialized ones) -- the decomposer does not keep track of the ability of the solver to follow the decomposed reasoning. In this paper, we propose LM2 to address these challenges. LM2 modularizes the decomposition, solution, and verification into three different language models. The decomposer module identifies the key concepts necessary to solve the problem and generates step-by-step subquestions according to the reasoning requirement. The solver model generates the solution to the subproblems that are then checked by the verifier module; depending upon the feedback from the verifier, the reasoning context is constructed using the subproblems and the solutions. These models are trained to coordinate using policy learning. Exhaustive experimentation suggests the superiority of LM2 over existing methods on in- and out-domain reasoning problems, outperforming the best baselines by $8.1\%$ on MATH, $7.71\%$ on JEEBench, and $9.7\%$ on MedQA problems (code available at https://github.com/LCS2-IIITD/Language_Model_Multiplex).

language model, lm 2, verifier, (16 more...)

arXiv.org Artificial Intelligence

2404.02255

Country:

Asia > India > NCT > Delhi (0.04)
Asia > Singapore (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Workflow (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.32)

Magnetic Field Prediction Using Generative Adversarial Networks

Pollok, Stefan, Olden-Jørgensen, Nataniel, Jørgensen, Peter Stanley, Bjørk, Rasmus

arXiv.org Artificial Intelligence | Mar-14-2022

Plenty of scientific and real-world applications are built on magnetic fields and their characteristics. To retrieve the valuable magnetic field information in high resolution, extensive field measurements are required, which are either time-consuming to conduct or even not feasible due to physical constraints. To alleviate this problem, we predict magnetic field values at a random point in space from a few point measurements by using a generative adversarial network (GAN) structure. The deep learning (DL) architecture consists of two neural networks: a generator, which predicts missing field values of a given magnetic field, and a critic, which is trained to calculate the statistical distance between real and generated magnetic field distributions. By minimizing this statistical distance, a reconstruction loss as well as physical losses, our trained generator has learned to predict the missing field values with a median reconstruction test error of 5.14%, when a single coherent region of field points is missing, and 5.86%, when only a few point measurements in space are available and the field measurements around are predicted. We verify the results on an experimentally validated field.

artificial intelligence, machine learning, magnetic field, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.jmmm.2023.170556

2203.07897

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Denmark > Capital Region > Kongens Lyngby (0.04)

Genre: Research Report > Promising Solution (0.46)

Industry: Energy (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Self-Managing Associative Memory for Dynamic Acquisition of Expertise in High-Level Domains

Beal, Jacob (BBN Technologies)

AAAI Conferences | Jun-23-2009

Self-organizing maps can be used to implement an associative memory for an intelligent system that dynamically learns about new high-level domains over time. SOMs are an attractive option for implementing associative memory: they are fast, easily parallelized, and digest a stream of incoming data into a topographically organized collection of models where more frequent classes of data are represented by higher-resolution collections of models. Typically, the distribution of models in an SOM, once developed, remains fairly stable, but developing expertise in a new high-level domain requires altering the allocation of models. We use a mixture of analysis and empirical studies to characterize the behavior of SOMs for high-level associative memory, finding that new high-resolution collections of models develop quickly. High-resolution areas of the SOM decay rapidly unless actively refreshed, but in a large SOM, the ratio between growth rate and decay rate may be high enough to support both fast learning and long-term memory.
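For readers unfamiliar with how an SOM digests a data stream, here is a minimal sketch of one online update on a one-dimensional map of scalar models. It assumes the standard Kohonen-style rule with a Gaussian neighbourhood; the function name and the `lr` and `sigma` values are hypothetical, and it is not the implementation studied in the paper.

```python
import math


def som_step(models, x, lr=0.5, sigma=1.0):
    """One online SOM update: pull the best-matching model and its map
    neighbours toward the sample x, and return the new list of models."""
    # Best-matching unit: index of the model closest to the sample.
    bmu = min(range(len(models)), key=lambda i: abs(models[i] - x))
    updated = []
    for i, m in enumerate(models):
        # Neighbourhood weight decays with distance on the map, not in data space.
        h = math.exp(-((i - bmu) ** 2) / (2.0 * sigma ** 2))
        updated.append(m + lr * h * (x - m))
    return updated
```

Feeding many samples from one region through `som_step` draws more models toward that region, which is the mechanism behind the higher-resolution collections for frequent classes that the abstract describes.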
